Often, we are interested in describing the richness and structure or diversity of an assemblage. But more importantly, we are often more interested in comparing richness and diversity across space and/or time, by merging data from multiple sites or from different sampling units within a site.
By richness, we mean the number of unique species or taxonomic/phylogenetic/functional units in the assemblage. By diversity, we usually mean some aspect of the structure of the assemblage - a combination of both the number of unique elements as well as the relative abundance of the different elements.
Overall, biological diversity metrics can be characterized as:
Today, we will be exploring different aspects of estimating richness and diversity. A good resource for this topic, and one on which I have relied heavily in this lecture (in addition to package vignettes), is the book “Biological Diversity” by Magurran and McGill https://global.oup.com/academic/product/biological-diversity-9780199580675?cc=us&lang=en&. This book may be available through your institutional library.
We will need to load all relevant package
r
r # uncomment the following line if you haven’t installed these packages yet # install.packages(c(‘vegan’, ‘neotoma’, ‘dplyr’, ‘sads’, ‘iNEXT’))
library(‘vegan’) library(‘neotoma’) library(‘dplyr’) library(‘sads’) library(‘iNEXT’)
Let’s start with a simple search for a single site in the Neotoma Paleoecology Database https://www.neotomadb.org/. We’ll download mammal data from one site (late Quaternary cave deposit) in northern California, Samwell Cave Popcorn Dome, along with the corresponding ages of the sampled units.
r
r samwell_site <- get_site(sitename=Cave) #there’s one site in the database that matches this samwell_datasets <- get_dataset(samwell_site[[1]]) # this search lists all datasets associated with Samwell Cave. There are 7 different datasets for the site: 4 containing vertebrate faunal information and 3 containing geochronologic information. browse(samwell_datasets) # Note that this opens a new window in your browser
scpd_dataset <- get_dataset(samwell_site[[1]])$‘14262’ scpd_download <- get_download(scpd_dataset)
scpdFauna <- counts(scpd_download)[[1]] scpdAges <- ages(scpd_download)[[1]]
OK, now that we have the basic dataset downloaded, we will need to clean up the data.
Question:
What kinds of data cleaning might we want to do here?
It’s often helpful to examine the structure of the data to answer this question.
r
r # examine the final downloaded file str(scpdFauna) # another possible way to examine the data: glimpse(scpdFauna)
'data.frame': 14 obs. of 22 variables:
$ Carnivora : num 1 0 0 0 0 0 0 0 0 0 ...
$ cf. Scapanus latimanus : num 1 1 2 3 2 1 1 0 3 2 ...
$ Chiroptera : num 3 8 5 13 9 5 14 3 11 7 ...
$ Leporidae : num 3 2 0 0 1 0 2 2 1 0 ...
$ Microtus sp. : num 38 30 15 51 40 31 32 13 15 19 ...
$ Neotoma sp. : num 32 44 28 35 28 20 39 14 28 24 ...
$ Peromyscus sp. : num 151 171 136 220 203 112 184 77 132 114 ...
$ Sorex sp. : num 4 2 2 3 6 0 2 0 4 0 ...
$ Spermophilus beecheyi : num 1 1 2 4 4 1 5 3 4 2 ...
$ Tamias sp. : num 1 0 0 1 1 0 0 1 1 3 ...
$ Tamiasciurus douglasii : num 1 1 1 2 1 2 1 0 3 2 ...
$ Thomomys bottae : num 5 9 2 6 6 1 1 3 1 1 ...
$ Thomomys sp. : num 14 18 23 42 25 19 24 18 21 30 ...
$ cf. Arborimus albipes : num 0 1 0 0 0 0 1 1 1 1 ...
$ Reithrodontomys cf. R. megalotis: num 0 1 1 1 0 0 0 0 1 0 ...
$ Sciurus griseus : num 0 1 1 3 1 1 2 0 3 0 ...
$ Spermophilus lateralis : num 0 2 0 1 0 1 1 0 2 1 ...
$ Glaucomys cf. G. sabrinus : num 0 0 0 2 1 0 1 0 0 0 ...
$ Thomomys cf. T. mazama : num 0 0 0 0 2 0 3 0 1 5 ...
$ Artiodactyla : num 0 0 0 0 0 0 2 0 0 0 ...
$ Myodes cf. M. californicus : num 0 0 0 0 0 0 2 0 0 0 ...
$ Aplodontia rufa : num 0 0 0 0 0 0 0 0 0 0 ...
r
r head(scpdFauna)
Start cleaning up the data files.
r
r # What are the rownames showing?
# Let’s replace the sample ID in the fauna object (rownames) with the ages # but only do this if the sample IDs match across objects ages <- scpdAges\(age if (all(rownames(scpdFauna) == scpdAges\)sample.id)){ rownames(scpdFauna) <- ages } # set the names of the small mammal genera of interest # note: this is a clunky workaround - ideally we’d be able to harmonize taxonomy using the database/API tools, but they aren’t as good in Neotoma as in PBDB because of the way taxonomy is handled sm_genus <- c(, , , , , , , , , , , , , , ) # create a new dataframe with the merged data scpdSM <- matrix(nrow=nrow(scpdFauna), ncol=length(sm_genus)) for (i in 1:length(sm_genus)){ cols <- grep(sm_genus[i], colnames(scpdFauna)) if (length(cols)==1){ scpdSM[,i] <- scpdFauna[,cols] } if (length(cols) > 1){ scpdSM[,i] <- rowSums(scpdFauna[,cols]) }
} # clean up dataframe colnames(scpdSM) <- sm_genus rownames(scpdSM) <- rownames(scpdFauna) save(scpdSM, file=/scpdFauna_cleaned.RData)
Rarefaction-based estimates
First, let’s calculate raw richness, for each sample through time
r
r # we can practice building our own functions to count up the species countFcn <- function(df){ count <- length(which(df>0)) return(count) } raw.S <- apply(scpdSM, 1, countFcn) # or you can do the same calculation using a built-in function (specnumber) in vegan specnumber(scpdSM)
752.31 2758.47 4639.2 5893.1 7021.6 7899.3 8902.3 10156.2 11410 12663.9 13917.7 15171.6
9 12 11 12 11 9 13 7 12 9 13 11
16425.4 17679.3
12 12
However, we know these are fossil data and (even if they weren’t!) sampling effort may have varied through time and across samples. ie, the total sample size (N) of each sample (stratum) varies substantially, as can be seen by examining the range of total N.
r
r range(rowSums(scpdSM))
[1] 130 376
r
r minN <- min(rowSums(scpdSM))
So one approach is rarefy each assemblage down to the minimum N when calculating richness. This effectively answers: What is the expected richness of an assemblage if we sampled the same number of individuals as were observed in some other assemblage? You did this manually in Seth’s section, and then he also introduced you to the function rarefy in the vegan package. rarefy “gives the expected species richness in random subsamples of size sample from the community” (from help doc) and performs this for all samples/assemblages in the community matrix.
r
r richness <- rarefy(scpdSM, sample=minN, se=T) # because I specified se=T, this function outputs both S and se S <- richness[1,] S.se <- richness[2,] # plot the result through time using base R plotting function plot(S ~ ages, type=, xlim=c(21000, 0), ylim=c(0, floor(max(S+S.se))+1), xlab=(cal BP), ylab=expression(‘Richness’ %+-% ‘se’)) #arrows(ages, S+S.se, ages, S-S.se, code=3, length=0.1, angle=90) suppressWarnings(arrows(ages, S+S.se, ages, S-S.se, code=3, length=0.1, angle=90))
r
r # Note: use suppressWarnings with caution. I added the suppressWarnings tag so that the only thing output is the plot. But I only did this after I confirmed that the warning message was ok. Switch the arrows lines that are commented/uncommented and re-run the plot with that line of code for arrows to view the warning and decide for yourself whether it should be suppressed or not.
Let’s compare rarefied to raw richness, for each sample through time
r
r plot(raw.S ~ ages, type=, xlim=c(21000, 0), ylim=c(0, max(raw.S)+1), lty=2, col=, xlab=(cal BP), ylab=‘Richness’) lines(S ~ ages, type=) r arrows(ages, S+S.se, ages, S-S.se, code=3, length=0.1, angle=90)
zero-length arrow is of indeterminate angle and so skipped
We can also compare the sampling curves using function rarecurve. This grabs random samples of different numbers of individuals in a step-wise function, then calculates richness for each sample.
r
r # set colors, from cool for the older (LGM) times and red for the warmer (Holocene) times ageColors <- rainbow(length(ages), start=0, end=4/6) rarecurve(scpdSM, col=ageColors)
Extrapolation-based estimates of richness
Rarefaction calculates the expected richness if the same number of individuals were sampled across different samples. However, there are other ways to estimate richness and compare among samples. Among the more common alternatives to rarefaction are the Chao and ACE (Abundance-based Coverage Estimator) estimators of richness. Let’s first calculate them, and then discuss.
r
r ?estimateR # read more about this function, and how ACE and Chao1 are calculated alt_richness <- estimateR(scpdSM) # Bind all richness estimates together and compare: richness_all <- rbind(alt_richness, richness) # plot them against one another y.max <- floor(max(richness_all))+1 # set y max for plotting mainEstimates <- c(‘S.obs’, ‘S.chao1’, ‘S.ACE’, ‘S’) plotRows <- match(mainEstimates, rownames(richness_all)) plotCols <- c(, , , )
i=1 plot(richness_all[plotRows[i],] ~ ages, type=, ylim=c(0, y.max), xlim=c(21000, 0), xlab=(cal BP), ylab=, col=plotCols[i]) for (i in 2:length(plotRows)){ lines(richness_all[plotRows[i],] ~ ages, col=plotCols[i]) } r legend(, legend=mainEstimates, lty=1, col=plotCols, bty=, cex=0.75)
Discussion
You can see that these two approaches (rarefaction vs extrapolation) produce fundamentally different estimates of the trend of species richness through time. Individual-based rarefaction (S) shows a trend towards lower richness from the Pleistocene to the Holocene, whereas ACE and Chao1 estimates show a trend towards higher richness from the Pleistocene to the Holocene.
So what’s the difference? Individual-based rarefaction (S) estimates expected richness given a subsample of n individuals from the assemblage. It’s a simple form of standardization so that different samples are comparable. In this case, n is sampled at the minimum N observed among the samples (N=130). And since n is generally lower than the total N sampled, it’s really an underestimate of on-the-ground richness.
On the other hand, Chao1 and ACE estimate the asymptotic richness estimates. ie, they both try to extrapolate beyond the sampled individuals to figure out the expected richness if the assemblage was sampled completely. They focus in particular on information in the detected rare species (singletons, etc) to estimate undetected species.
Chao1 and ACE have been shown to perform better than other estimators in various studies. They are also similar to (or fundamentally the same as) SQS. A recent study (Close et al. 2018) examined differences among richness estimators and found “sampling in the tetrapod fossil record is generally not yet good enough to use extrapolators unless they are applied within a rarefy‐and‐extrapolate protocol of the kind used in our simulation experiments” and “In particular, rarefying extrapolators to equal coverage is the best way to remove confounding effects of among‐sample variation in evenness, a problem that affects all richness estimators when sampling is comparatively limited”. The package iNEXT implements this rarefying extrapolators for Chao1, so let’s do a quick tour through that package.
r
r # this chunk uses the iNext package vignette(, package=) iNextRichness <- iNEXT(t(scpdSM)) # Note 1: we had to transpose the community matrix # Note 2: this calculates not just richness, but also Shannon and Simpson diversities. # You can also rarefy within iNEXT using estimateD D_minN <- estimateD(t(scpdSM), datatype=, base=, level=NULL) D_minN r # Note 1: SC = sample coverage, qD is the estimate with lower and upper confidence levels # Note 2: Richness estimates here are the same as using rarefy in vegan. e.g., compare the following: D_minN[which(D_minN$order == 0),]$qD
[1] 7.288 8.308 8.971 9.257 8.342 8.059 9.174 7.000 10.889 8.306 11.281 10.424 10.190
[14] 10.502
r
r richness[1,]
752.31 2758.47 4639.2 5893.1 7021.6 7899.3 8902.3 10156.2 11410
7.288080 8.308185 8.971041 9.257188 8.341658 8.058694 9.173950 7.000000 10.889424
12663.9 13917.7 15171.6 16425.4 17679.3
8.305880 11.281020 10.424141 10.190165 10.502225
Plot the Chao estimators
r
r ggiNEXT(iNextRichness) ggiNEXT(iNextRichness, type=1, facet.var=) ggiNEXT(iNextRichness, type=2)
OK, up to now we’ve just been counting the number of species in different ways. The other aspect of assemblages is their abundance structure - how abundance varies among the different species in the deposit.
Diversity
There are several common diversity measures, which combine both richness and evenness. I think it’s helpful to view the calculations for these by looking at one of the vegan vignettes
r
r vignette(-vegan) H <- diversity(scpdSM, index=) Simp <- diversity(scpdSM, index=)
Practice calculating Shannon on your own
r
r # Shannon pi.SCPD <- scpdSM/rowSums(scpdSM) pi.SCPDxlogpi <- pi.SCPD * log(pi.SCPD) sums_shan <- rowSums(pi.SCPDxlogpi, na.rm=T) H_calc <- -sums_shan all(H==H_calc)
[1] TRUE
r
r # Simpson pi.SCPD2 <- pi.SCPD^2 sums_simp <- rowSums(pi.SCPD2, na.rm=T) Simp_calc <- 1-sums_simp all(Simp==Simp_calc)
[1] TRUE
Evenness
You were introduced to evenness is Seth’s lecture. There aren’t good built-in functions for calculating evenness (that I know of), but Pielou’s (or Shannon) evenness can be derived from Shannon diversity (H). (H/ln(S)). The idea underlying both of these measures is that if diversity incorporates both richness and evenness, then factoring out richness from the diversity measures should leave us with evenness.
r
r Shan_even <- H/log(specnumber(scpdSM)) Simp_even <- Simp/log(specnumber(scpdSM))
Let’s plot these through time
r
r plotCols <- c(, , , ) plot(H~ages, type=, ylim=c(0, 2), xlim=c(21000, 0), xlab=(cal BP), ylab=, col=plotCols[1]) lines(Simp~ages, col=plotCols[2]) r lines(Shan_even~ages, col=plotCols[3], lty=2) lines(Simp_even~ages, col=plotCols[4], lty=2)
r legend(, legend=c(, , _Shannon, _Simpson), lty=c(1,1,2,2), col=plotCols, bty=, cex=0.75)
Note: you could also calculate Shannon and Simpson through iNext (as above), and probably using several other packages.
Let’s ask: How do richness, evenness, and diversity vary across latitude and longitude? We will use a dataset of Quaternary fossil pollen data from eastern North America for 6000 years before present.
Note, this is not count data - the data are provided as the relative abundance of each taxon at the site - so the rarefaction and extrapolation estimators don’t work. They only accept count data / integers.
Data loading and prep
r
r # load the data and examine the structure pollen.6000 <- read.csv(file=/pollen/6000.interp.data_clean_sites.csv) str(pollen.6000)
'data.frame': 369 obs. of 108 variables:
$ sites : Factor w/ 369 levels \3PINES\,\ALEXISLK\,..: 1 2 3 4 5 6 7 8 9 10 ...
$ Sambucus : num 0 0 0 0 0 ...
$ Viburnum : num 0 0 0 0 0 0 0 0 0 0 ...
$ Liquidambar : num 0 0 0 0 0.00645 ...
$ Salsola : int 0 0 0 0 0 0 0 0 0 0 ...
$ Allium : int 0 0 0 0 0 0 0 0 0 0 ...
$ Ilex : num 0 0 0 0 0 0 0 0 0 0 ...
$ Nemopanthus : num 0 0 0 0 0 0 0 0 0 0 ...
$ Ambrosia.type : num 0 0 0 0 0.00216 ...
$ Artemisia : num 0.04612 0 0 0.00282 0 ...
$ Bidens : num 0 0 0 0 0 0 0 0 0 0 ...
$ Iva : num 0 0 0 0 0 ...
$ Xanthium : num 0 0 0 0 0.000714 ...
$ Alnus : num 0.0777 0 0.0384 0.4222 0.5674 ...
$ Corylus : num 0 0.001024 0 0 0.000714 ...
$ Ostrya.Carpinus : num 0.08625 0 0 0 0.00574 ...
$ Betula : num 0.1981 0.1422 0.2306 0.2425 0.0187 ...
$ Bursera : int 0 0 0 0 0 0 0 0 0 0 ...
$ Opuntia : num 0 0 0 0 0 0 0 0 0 0 ...
$ Celtis : num 0 0 0 0 0 ...
$ Trema : num 0 0 0 0 0 0 0 0 0 0 ...
$ Diervilla : num 0 0 0 0 0 0 0 0 0 0 ...
$ Lonicera : num 0 0 0 0 0 0 0 0 0 0 ...
$ Linnaea.borealis : int 0 0 0 0 0 0 0 0 0 0 ...
$ Symphoricarpos : num 0 0 0 0 0 0 0 0 0 0 ...
$ Stellaria : int 0 0 0 0 0 0 0 0 0 0 ...
$ Clethra : num 0 0 0 0 0 0 0 0 0 0 ...
$ Cornus : num 0 0 0 0 0 0 0 0 0 0 ...
$ Nyssa : num 0 0 0 0 0.00286 ...
$ Juniperus.Thuja : num 0 0 0 0 0 0 0 0 0 0 ...
$ Taxodium : num 0 0 0 0 0 ...
$ Cyrilla.racemiflora : num 0 0 0 0 0 0 0 0 0 0 ...
$ Diapensia : int 0 0 0 0 0 0 0 0 0 0 ...
$ Shepherdia : num 0 0 0 0 0 0 0 0 0 0 ...
$ Ephedra : num 0 0 0 0 0 ...
$ Arctostaphylos : num 0 0 0 0 0 0 0 0 0 0 ...
$ Empetrum : num 0 0 0 0 0 0 0 0 0 0 ...
$ Rhododendron : num 0 0 0 0 0 0 0 0 0 0 ...
$ Chamaedaphne.calyculata: num 0 0 0 0 0 0 0 0 0 0 ...
$ Vaccinium : num 0 0 0 0 0 0 0 0 0 0 ...
$ Amorpha : num 0 0 0 0 0 0 0 0 0 0 ...
$ Dalea : num 0 0 0 0 0 0 0 0 0 0 ...
$ Trifolium : int 0 0 0 0 0 0 0 0 0 0 ...
$ Melilotus : int 0 0 0 0 0 0 0 0 0 0 ...
$ Castanea : num 0 0 0 0 0.00715 ...
$ Fagus : num 0 0 0 0 0.00359 ...
$ Quercus : num 0.09733 0.00564 0 0 0.25066 ...
$ Geranium : int 0 0 0 0 0 0 0 0 0 0 ...
$ Ribes : int 0 0 0 0 0 0 0 0 0 0 ...
$ Hypericum : num 0 0 0 0 0 0 0 0 0 0 ...
$ Iris : int 0 0 0 0 0 0 0 0 0 0 ...
$ Sisyrinchium : int 0 0 0 0 0 0 0 0 0 0 ...
$ Itea : num 0 0 0 0 0 0 0 0 0 0 ...
$ Carya : num 0 0 0 0 0.0387 ...
$ Engelhardtia : int 0 0 0 0 0 0 0 0 0 0 ...
$ Juglans : num 0 0 0 0 0 0 0 0 0 0 ...
$ Pterocarya : num 0 0 0 0 0 0 0 0 0 0 ...
$ Stachys : int 0 0 0 0 0 0 0 0 0 0 ...
$ Lycopus : num 0 0 0 0 0 0 0 0 0 0 ...
$ Mentha : int 0 0 0 0 0 0 0 0 0 0 ...
$ Lythrum : int 0 0 0 0 0 0 0 0 0 0 ...
$ Magnolia : num 0 0 0 0 0 0 0 0 0 0 ...
$ Liriodendron : num 0 0 0 0 0 0 0 0 0 0 ...
$ Tilia : num 0 0 0 0 0 ...
$ Waltheria : int 0 0 0 0 0 0 0 0 0 0 ...
$ Sphaeralcea : num 0 0 0 0 0 0 0 0 0 0 ...
$ Fraxinus : num 0 0 0 0 0 ...
$ Osmanthus : int 0 0 0 0 0 0 0 0 0 0 ...
$ Epilobium : num 0 0 0 0 0 0 0 0 0 0 ...
$ Abies : num 0.0863 0.0909 0.16 0.0538 0 ...
$ Larix : num 0.0692 0 0 0 0 ...
$ Picea : num 0.15199 0.75481 0.56231 0.25602 0.00215 ...
$ Tsuga : num 0.00853 0 0 0 0.0043 ...
$ Platanus : num 0 0 0 0 0 0 0 0 0 0 ...
$ Polygonella : int 0 0 0 0 0 0 0 0 0 0 ...
$ Polygonum : num 0 0 0 0 0.00144 ...
$ Eriogonum : num 0 0 0 0 0 0 0 0 0 0 ...
$ Fagopyrum : int 0 0 0 0 0 0 0 0 0 0 ...
$ Koenigia : int 0 0 0 0 0 0 0 0 0 0 ...
$ Rumex.Oxyria : int 0 0 0 0 0 0 0 0 0 0 ...
$ Anemone : num 0 0 0 0 0 0 0 0 0 0 ...
$ Ranunculus : num 0 0 0 0 0 0 0 0 0 0 ...
$ Thalictrum : num 0 0 0 0 0.000719 ...
$ Rhizophora : int 0 0 0 0 0 0 0 0 0 0 ...
$ Amelanchier : int 0 0 0 0 0 0 0 0 0 0 ...
$ Dryas : int 0 0 0 0 0 0 0 0 0 0 ...
$ Geum : int 0 0 0 0 0 0 0 0 0 0 ...
$ Potentilla : int 0 0 0 0 0 0 0 0 0 0 ...
$ Sanguisorba : num 0 0 0 0 0 0 0 0 0 0 ...
$ Spiraea : int 0 0 0 0 0 0 0 0 0 0 ...
$ Prunus : int 0 0 0 0 0 0 0 0 0 0 ...
$ Cephalanthus : num 0 0 0 0 0.0751 ...
$ Galium : num 0 0 0 0 0 0 0 0 0 0 ...
$ Populus : num 0 0 0 0 0.00571 ...
$ Salix : num 0.04612 0.00102 0.00229 0.00679 0.00501 ...
$ Comandra : int 0 0 0 0 0 0 0 0 0 0 ...
$ Acer : num 0 0 0 0 0 0 0 0 0 0 ...
$ Sarcobatus : num 0 0 0 0 0 0 0 0 0 0 ...
$ Symplocos : int 0 0 0 0 0 0 0 0 0 0 ...
[list output truncated]
r
r # clean up the site/rownames pollen.6000.sites <- pollen.6000\(sites rownames(pollen.6000) <- pollen.6000\)sites pollen.6000 <- pollen.6000[,-1]
Richness, diversity, and evenness calculations
r
r # calculate richness and diversity S.obs <- specnumber(pollen.6000) H <- diversity(pollen.6000, ) Simp <- diversity(pollen.6000, ‘simpson’) H_even <- H/log(S.obs)
OK, now examine how diversity changes across lat and long
r
r # get site lat/longs meta <- read.delim(file=/pollen/All.site.data-withagemodel-finalv2.txt, sep=\t) site.longlats <- meta[match(pollen.6000.sites, meta$Handle), c(‘Longitude’, ‘Latitude’)] # simple relationships with latitude and longitude par(mfrow=c(2,2), mar=c(4,4,1,1)+0.1) plot(H~site.longlats\(Latitude, pch=16, xlab=\Latitude\) plot(S.obs~site.longlats\)Latitude, pch=16, xlab=) r plot(Simp~site.longlats\(Latitude, pch=16, xlab=\Latitude\) plot(H_even~site.longlats\)Latitude, pch=16, xlab=)
r par(mfrow=c(2,2), mar=c(4,4,1,1)+0.1)
r
r plot(H~site.longlats\(Longitude, pch=16, xlab=\Longitude\) plot(S.obs~site.longlats\)Longitude, pch=16, xlab=) r plot(Simp~site.longlats\(Longitude, pch=16, xlab=\Longitude\) plot(H_even~site.longlats\)Longitude, pch=16, xlab=)
Note: In the ‘data’ folder, you have similar pollen files for multiple time periods.
Questions:
What else could you do with these data? Explore additional diversity analyses, either using these data for 6000 years BP, or by looking at the datasets for other times.
Use the knowledge you’ve gained from today or previous modules in the course to extract information from other sites in the PaleobioDB or Neotoma and calculate diversity.
One other common way to visualize the structure of communities is through species abundance distributions (SADs), and related rank abundance distributions (RADs). This is a way to visualize the differences in commonness and rarity among species, and can potentially be compared among communities. SADs/RADs have a long history in ecology, and the general pattern is that to be rare is common and to be common is rare. That is, very few species are highly abundant, and many more species have low abundances.
r
r # this chunk uses the sads package vignette(‘sads_intro’)
Let’s revisit the small mammal data from Samwell Cave and answer the question: has the structure of the community changed through time? We’ll focus mostly on RADs.
r
r # examine the two endpoints scpd1.oc <- octav(scpdSM[1,]) scpd14.oc <- octav(scpdSM[14,]) scpd1.oc
Object of class \octav\
r
r scpd14.oc
Object of class \octav\
r
r par(mfrow=c(4,4), mar=c(3,3,2,0)+0.1) for (i in 1:nrow(scpdSM)){ plot(rad(scpdSM[i,]), prop=F, xlab=\, ylab=\, xlim=c(1, 13), type=, pch=16) mtext(side=2, line=2, of Indiv, cex=0.5) mtext(side=3, line=0.5, paste0(=, ages[i]), cex=0.75) mtext(side=1, line=2, , cex=0.5) }
Question:
What do the figures above show us?
Alternate approach
McGill (2011, Ch. 9) identifies a few problems with simplistic visual comparisons of RADs that are visible in the figure above.
He suggests instead using ECDFs (empirical cumulative density functions) instead.
r
r par(mfrow=c(4,4), mar=c(3,3,2,0)+0.1) for (i in 1:nrow(scpdSM)){ plot(ecdf(scpdSM[i,]), xlab=\, ylab=\, xlim=c(1, 13), main=\, verticals=T, do.points=F, cex.axis=0.75) mtext(side=2, line=2, % of Species, cex=0.5) mtext(side=3, line=0.5, paste0(=, ages[i]), cex=0.75) mtext(side=1, line=2, % abundance, cex=0.5) }
Formal model fitting
It’s possible using the sads package (and also vegan, though the approaches differ) to fit various models to the SADs/RADs. There are quite a few different models to choose from, and different authors have postulated that different distributions are indicative of different types of communities. I’ve randomly chosen three: geometric (low diversity), Fisher’s log-series (medium diversity), and lognormal (high diversity).
r
r # fit the models scpd1.geom <- fitsad(scpdSM[1,scpdSM[1,]>0], ‘geom’) scpd1.ls <- fitsad(scpdSM[1,scpdSM[1,]>0], ‘ls’) scpd1.lnorm <- fitsad(scpdSM[1,scpdSM[1,]>0], ‘lnorm’) # compare the fit of each model AICtab(scpd1.geom, scpd1.ls, scpd1.lnorm, base=TRUE)
AIC dAIC df
scpd1.ls 68.6 0.0 1
scpd1.lnorm 74.3 5.7 2
scpd1.geom 79.4 10.8 1
r
r # transform to appropriate class for plotting scpd1.geom.oc <- octavpred(scpd1.geom) scpd1.ls.oc <- octavpred(scpd1.ls) scpd1.lnorm.oc <- octavpred(scpd1.lnorm) # plot the three models against one another and the data plot(scpd1.oc) lines(scpd1.geom.oc, col=) r lines(scpd1.ls.oc, col=) lines(scpd1.lnorm.oc, col=)
r legend(, c(, , ), lty=1, col=c(,, ))